Abstract: 

We present the results of a simulation study into the properties of 12 
different estimators of the Hurst parameter, H, or the fractional integra- 
tion parameter, d, in long memory time series. We compare and contrast 
their performance on simulated Fractional Gaussian Noises and fractionally 
integrated series with lengths between 100 and 10,000 data points and H 
values between 0.55 and 0.90 or d values between 0.05 and 0.40. We apply 
all 12 estimators to the Campito Mountain data and estimate the accuracy 
of their estimates using the Beran goodness of fit test for long memory time 
series. 
1. Introduction 



The subject of long-memory time series was brought to prominence by Hurst 
(1951) and has subsequently received extensive attention in the literature. See 
the volumes by Beran (1994), Embrechts and Maejima (2002) and Palma (2007) 
and the collections of Doukhan et al. (2005) and Robinson (2005) and the 
references therein. 

Of critical importance in analyzing and modeling long memory time series is 
estimating the strength of the long-range dependence. Two measures are com- 
monly used. The parameter H , known as the Hurst or self-similarity parameter, 
was introduced to applied statistics by Mandelbrot and van Ness (1968) and 
arises naturally from the study of self-similar processes. The other measure, 
the fractional integration parameter, d, arises from the generalization of the 
Box-Jenkins ARIMA(p,d,q) models from integer to non-integer values of the 
integration parameter d. This generalization was accomplished independently by 
Granger and Joyeux (1980) and by Hosking (1981). The fractional integration 
parameter d is also the discrete time counterpart to the self-similarity parameter 
H and the two are related by the simple formula H = d + 1/2. 
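
As a minimal illustration of the relation H = d + 1/2, the following Python sketch converts between the two parameters and maps the d values quoted in the abstract (0.05 to 0.40) onto the corresponding H values (0.55 to 0.90). The function names are hypothetical and chosen only for this example; the step of 0.05 is the one used in the simulations described in Section 2.

```python
def d_to_h(d: float) -> float:
    """Convert the fractional integration parameter d to the Hurst parameter H."""
    return d + 0.5


def h_to_d(h: float) -> float:
    """Convert the Hurst parameter H to the fractional integration parameter d."""
    return h - 0.5


# d values from 0.05 to 0.40 in steps of 0.05 (eight values).
d_grid = [round(0.05 * k, 2) for k in range(1, 9)]

# The equivalent H values, from 0.55 to 0.90.
h_grid = [round(d_to_h(d), 2) for d in d_grid]

for d, h in zip(d_grid, h_grid):
    print(f"d = {d:.2f}  <->  H = {h:.2f}")
```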

A number of estimators of H and d have been developed. These are usually 
validated by an appeal to some aspect of self-similarity, or by an asymptotic 
analysis of the distributional properties of the estimator as the length of the 
time series converges to infinity. 

A number of theoretical results on the asymptotic properties of various 
estimators have been obtained. The aggregated variance method was shown to be 
asymptotically biased of the order 1/log N, where N is the number of 
observations, by Giraitis et al. (1999), who also showed the GPH (Geweke and 
Porter-Hudak, 1983) estimator was asymptotically normal and unbiased. Robinson 
(1994) proved the averaged periodogram estimator was consistent under very mild 
conditions. Lobato and Robinson (1996) obtained its limiting distribution. The 
Peng et al. (1994) estimator was proved to be asymptotically unbiased by Taqqu 
et al. (1995). Some theoretical properties of the R/S estimator have been 
examined by Mandelbrot (1975) and Mandelbrot and Taqqu (1979). Mandelbrot (1975) 
proved that the R/S statistic is robust to the increment process having a 
long-tailed distribution in the sense that E[X_i^2] = ∞. However, Bhattacharya 
et al. (1983) proved that the R/S statistic was not robust to departures from 
stationarity. Thus for a short memory process with slowly decaying deterministic 
trend the R/S statistic will report an estimate of H which implies the presence 
of long memory. An estimator based on wavelets was proved asymptotically 
unbiased and efficient by Abry et al. (1998). They also showed the traditional 
variance type estimators were fundamentally flawed and could not lead 
to good estimators of H. Fox and Taqqu (1986) proved the Whittle estimator 
was consistent and asymptotically normal for Gaussian long range dependent 
sequences. Dahlhaus (1989) proved the estimator of Fox and Taqqu (1986) was 
efficient. Further theoretical results on the Whittle estimator can be found in 
Horvath and Shao (1999). 

Because the finite sample properties of these estimators can be quite different 
from their asymptotic properties, some previous authors have undertaken 
empirical comparisons of estimators of H and d. Nine estimators were discussed in 
some detail by Taqqu et al. (1995), who carried out an empirical study of these 
estimators for a single series length of 10,000 data points, five values of both H 
and d, and 50 replications. Teverovsky and Taqqu (1999) showed in a simulation 
study that the differenced variance estimator was unbiased for five values 
of H (0.5, 0.6, 0.7, 0.8, and 0.9) for series with 10,000 observations whereas 
the aggregated variance estimator was downwards biased. Jensen (1999) undertook 
a comparison of two estimators based on wavelets, one proposed by Jensen 
(1999) and the other proposed by McCoy and Walden (1996), with the GPH 
estimator for four series lengths (2^7, 2^8, 2^9 and 2^10 observations), five 
values of d and 1000 replications. The wavelet estimators were reported to have 
lower mean squared errors (MSEs) than the GPH estimator for all d values and 
series lengths investigated. Jeong et al. (2007) carried out a comparison of six 
estimators on simulated fractional Gaussian noises (FGNs) with 32,768 (2^15) 
observations, five values of H and 100 replications. 

Several of the above empirical investigations would have been limited by 
the then available computer power, which has since increased considerably. We 
have extended these studies to a larger number of parameters, a higher number 
of replications and 12 estimators as detailed in Section 2 below. 

The remainder of the paper is organized as follows. Section 2 gives details 
of the method. Section 3 presents the results. Section 4 applies the methods 
to the Campito Mountain data which is regarded as a standard example of a 
long memory time series. Section 5 contains the discussion and Section 6 
gives our conclusions and suggests avenues of future research. 



2. Method 



Ten H estimators are implemented in the contributed package fSeries of 
Wuertz (2005) for the popular statistical software R (R Development Core Team, 
2005). They are the absolute value, aggregated variance, boxed periodogram, 
differenced variance, Higuchi, Peng, periodogram, rescaled range, wavelet, and 
the Whittle. The wavelet estimator is discussed in some detail by Abry and 
Veitch (1998), Abry et al. (1998) and Veitch and Abry (1999) and the other nine 
are discussed by Taqqu et al. (1995). Further, the GPH (Geweke and 
Porter-Hudak, 1983) and Haslett and Raftery (1989) estimators are implemented 
as estimators for d in the contributed package fracdiff of Fraley et al. (2006). 



Taqqu et al. (1995) simulated FGNs and the corresponding discrete time 
fractionally integrated (FI(d)) series and found that each estimator performed 
similarly whether estimating H in simulated FGNs or d in simulated FI(d)s. For 
example, if an estimator was biased when estimating H it was also biased in a 
very similar manner when estimating d. Thus, with the exception of the GPH 
and Haslett-Raftery estimators, we only investigated each estimator's 
performance in estimating H for simulated FGNs. FGNs were generated using the 
function fgnSim in fSeries. We ran 1000 replications of simulated FGNs with 
100 different lengths and eight different H values. The lengths were between 
100 and 10,000 data points in steps of 100. The H values were between 0.55 
and 0.90 in steps of 0.05. For each series H was estimated by each of these ten 
estimators. For each H value and series length we estimated the median and the 
75% and 95% confidence intervals empirically from the simulated data. The H or d 
estimates were sorted into ascending order and the median obtained by 
averaging the 500th and 501st values. Similar calculations were done for the upper 
and lower values of the 75% and 95% confidence intervals. 
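
The order-statistic calculation described above can be sketched in Python as follows. The median rule (averaging the 500th and 501st of 1000 sorted values) is taken from the text. The positions used for the interval bounds (for example the 25th and 26th values for the lower 95% bound) are an assumption made for this illustration, since the text only says that "similar calculations" were used. The function names and the random placeholder data are hypothetical.

```python
import random


def order_stat_average(sorted_values, k):
    """Average the k-th and (k+1)-th smallest values, with k counted from 1."""
    return 0.5 * (sorted_values[k - 1] + sorted_values[k])


def empirical_summary(estimates):
    """Empirical median and 75% / 95% intervals from replicated estimates.

    Assumes the number of replications is large enough that every cut
    position k satisfies 1 <= k < n (true for the 1000 replications used here).
    """
    values = sorted(estimates)
    n = len(values)

    def cut(numerator, denominator):
        # Position of the cut as an exact integer, e.g. 1000 * 1 // 2 = 500.
        return order_stat_average(values, n * numerator // denominator)

    return {
        "median": cut(1, 2),
        "ci75": (cut(1, 8), cut(7, 8)),      # assumed: 125th/126th and 875th/876th
        "ci95": (cut(1, 40), cut(39, 40)),   # assumed: 25th/26th and 975th/976th
    }


# Placeholder data standing in for 1000 replicated H estimates at one
# (H, series length) combination; real estimates would come from an estimator.
rng = random.Random(0)
placeholder_estimates = [rng.gauss(0.88, 0.03) for _ in range(1000)]
print(empirical_summary(placeholder_estimates))
```

Working with integer positions rather than interpolated quantiles keeps the sketch close to the procedure described in the text.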

For the GPH and Haslett-Raftery estimators we generated FI(d) series with 
the function farimaSim in fSeries over the range 0.05 to 0.40 in steps of 0.05. 
The other details are the same as above. In the presentation of the results 
we converted the GPH and Haslett-Raftery d estimates to H equivalents to 
facilitate comparisons among the estimators. 

The simulations and estimations were performed on a SunBlade 1000 with a 
750 MHz UltraSPARC-III CPU with 2 GB of memory and a Sun Ultra 10 with a 
440 MHz UltraSPARC-IIi CPU and 1 GB of memory. 



3. Results 

To present the results in tabular form would require a very large amount of space. 
Thus we present them in graphical form. Figures 1 through 7 present some 
of the results. Figures 1 through 6 are presented with the vertical axis with 
a range of 1.2 H units to facilitate comparisons among the estimators' standard 
deviation of their estimates. It should be noted that stationary long memory 
occurs in the range 0.5 < H < 1.0. Baillie (1996) states that for 1.0 < H < 1.5 
the series are non-stationary but mean reverting while for 0 < H < 0.5 the 
series are anti-persistent. Figure 7 presents the mean squared error (MSE) as 
a function of series length. We report MSE for series lengths greater than or 
equal to 500 data points. Again the vertical axes all have the same range to 
facilitate comparisons. 

The results for the absolute value of the variance method are presented in 
Figures 1 (a) and (c). The absolute value of the variance method was unbiased 
at all series lengths when H was low (0.55 or 0.60) but became progressively 
biased and underestimated H as H increased. 

The results for the aggregated variance method are presented in Figures 1 
(b) and (d). The aggregated variance method exhibited bias and underestimated 
H in short series when H was low. As H increased the estimator became in- 
creasingly biased at all series lengths examined. With H = 0.90 the true value of 
H lay above the upper 95% empirical confidence interval for all but the shortest 
series lengths. 

The results for the boxed periodogram method are presented in Figures 2 
(a) and (c). The boxed periodogram method was developed specifically to deal 
with perceived problems with the periodogram estimator. Comparing the boxed 
periodogram with the unmodified periodogram method in Figures 4 (a) and 
(c), we can see that for FGNs where the series were short and H was high 
the periodogram method was biased towards overestimating H. The boxed 
periodogram was biased towards underestimating H for almost all values of H 
and series lengths examined. 

Fig 1. Empirical confidence intervals for the H estimates with H = 0.60 and H = 0.90; (a) 
and (c) absolute value method, (b) and (d) aggregated variance estimator. 

The results for the differenced variance method are presented in Figures 2 
(b) and (d). The differenced variance method had one of the largest confidence 
intervals of the estimators when the series were short but this slowly decreased 
as sample size increased. Only the GPH, periodogram and wavelet methods had 
a similarly wide confidence interval for short series. The differenced variance 
estimator exhibited bias towards overestimating H for any series with fewer than 
7,000 observations. The bias was very serious in the short series. For series longer 
than about 9,000 observations the estimator exhibited a small amount of bias 
towards underestimating H. 

The results for the Higuchi (1988) estimator are presented in Figures 3 (a) 
and (c). The Higuchi estimator was biased towards underestimating H but the 
magnitude of the bias appeared relatively independent of H. The width of the 
confidence interval of the estimate increased with increasing H. 

The results for the Peng et al. (1994) estimator are presented in Figures 3 
(b) and (d). The Peng estimator was biased toward underestimating H in the 
series lengths we investigated. This bias was very small and appeared to be 
independent of H, though it appeared greater in short series. 

The results for the periodogram estimator were discussed above in conjunc- 
tion with the boxed periodogram estimator. 
Fig 2. Empirical confidence intervals for the H estimates with H = 0.60 and H = 0.90; (a) 
and (c) boxed periodogram method, (b) and (d) differenced variance estimator. 
Panels: (a) BoxPer H=0.60, (b) Diffvar H=0.60, (c) BoxPer H=0.90, (d) Diffvar H=0.90. 
Horizontal axis: Series Length. Legend: Actual, Mean, 75% CI, 95% CI. 



The results for the R/S estimator are presented in Figures 4 (b) and (d). The 
R/S estimator is of considerable historical interest because it was first proposed 
by Hurst and was used extensively in early studies of long-memory processes. 
However, as can be seen from Figures 4 (b) and (d) the R/S estimator exhibited 
three problems: it was biased upwards when H was low, it was biased downwards 
when H was high, and the confidence interval of the estimate did not decrease 
with increasing series length once the series reached about 1000 observations. 

Fig 3. Empirical confidence intervals for the H estimates with H = 0.60 and H = 0.90; (a) 
and (c) Higuchi estimator, (b) and (d) Peng estimator. 

The results for the Whittle estimator are presented in Figures 5 (a) and 
(c). Compared to the other nine estimators implemented in fSeries the Whittle 
estimator was remarkable for its narrow confidence interval. It only displayed a 
small amount of downwards bias when the series were short and H was high. 
There was an implementation issue in the software we used. The Whittle 
estimator would terminate with an error when H was low and the series contained 
only a few hundred observations. Thus in Figure 5 (a) there was no data for 
series with fewer than 300 observations in the H = 0.65 results. 

The results for the wavelet estimator are presented in Figures 5 (b) and 
(d). The wavelet estimator was unbiased for all H values at series lengths over 
4,100 data points. The bias present in series shorter than 4,100 data points was 
very small. The availability of a new octave can be seen in Figures 5 (b) and 
(d) with each doubling of the series length. New octaves resulted in a series of 
steps in the reduction of the confidence interval of the estimate with increasing 
series length. The estimator had constant variance when the number of octaves 
was fixed. 

Fig 4. Empirical confidence intervals for the H estimates with H = 0.60 and H = 0.90; (a) 
and (c) periodogram estimator, (b) and (d) R/S estimator. 

The results for the GPH estimator are presented in Figures 6 (a) and (c). 
The GPH estimator exhibited a very small amount of bias towards overestimating 
d at all series lengths examined. It had a very wide confidence interval which 
narrowed slowly as the series length increased. 

The results for the Haslett-Raftery estimator are presented in Figures 6 (b) 
and (d). The Haslett-Raftery estimator did not report estimates of d less than 
zero (H < 0.5). Hence for low d and short series the distribution was truncated 
on the low side at d = 0 or H = 0.5 as in Figure 6 (b). The Haslett-Raftery 
estimator was an excellent estimator with only small amounts of bias in the 
short series and had a narrow confidence interval. 

Fig 5. Empirical confidence intervals for the H estimates with H = 0.65 and H = 0.90; (a) 
and (c) Whittle estimator, (b) and (d) wavelet estimator. 

Figure 7 presents the MSEs for the estimators for H = 0.9 or d = 0.4 as 
appropriate. This is an alternative way to look at the data from the simulations. 
We only report MSEs for series of 500 data points and longer because of the high 
MSEs for some estimators in the short series. The Whittle and Haslett-Raftery 
both had low MSEs in all series greater than 500 data points in length. The 
step reductions in the MSE for the wavelet estimator can be clearly seen each 
time a new octave became available. 
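
For readers who want to reproduce this kind of summary, the sketch below computes the MSE of a set of replicated estimates about a known true value, together with its decomposition into squared bias and variance. The text does not give the formula used, so the standard definition is assumed here. The function names and the four example estimates are hypothetical.

```python
def mean_squared_error(estimates, true_value):
    """Average squared deviation of the estimates from the true parameter."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def bias_and_variance(estimates, true_value):
    """Bias (mean estimate minus truth) and variance about the mean estimate."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - true_value
    variance = sum((e - mean) ** 2 for e in estimates) / n
    return bias, variance


# Illustrative estimates for a series simulated with H = 0.9.
example = [0.85, 0.90, 0.95, 0.80]
mse = mean_squared_error(example, 0.9)
bias, variance = bias_and_variance(example, 0.9)

# The MSE equals squared bias plus variance, so an estimator can have a
# large MSE either because it is biased or because it is highly variable.
print(mse, bias ** 2 + variance)
```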



4. Application: Campito Mountain Data 

Fig 6. Empirical confidence intervals for the H estimates with d = 0.10 (H = 0.60) and 
d = 0.40 (H = 0.90); (a) and (c) GPH estimator, (b) and (d) Haslett-Raftery estimator. 

The Campito Mountain bristlecone pine data is regarded as a standard 
example of a long memory time series. It is a 5405-year series of annual tree ring 
widths of bristlecone pines on Campito Mountain, California. It was studied 
by Baillie and Chung (2002) who determined that an ARFIMA(0,0.44,0) model 
fitted the data best. The lack of additional short term correlation in the data 
means it is a good candidate for modeling with an FGN. 

Fig 7. Mean squared errors (MSE) as a function of series length for all 12 estimators with 
d=0.4 for the GPH and Haslett-Raftery and H=0.9 for the other ten. MSEs are reported 
starting at a series of 500 data points. (a) Absolute Value, Aggregated Variance and Whittle. 
(b) Boxed Periodogram, Differenced Variance and Higuchi. (c) Peng, Periodogram and R/S. 
(d) Wavelet, GPH and Haslett-Raftery. 

The Campito Mountain data is available in the R package tseries as the data 
set camp. We applied the 12 estimators to this series and estimated the goodness 
of fit to an FGN for all estimators where possible, except the Haslett-Raftery, 
using the test of Beran (1992) as implemented in the R package longmemo of 
Beran et al. (2006). The Beran test is more powerful against underestimation 
of H than overestimation. The Beran test was unable to be used for H values 
exceeding unity. It is important to note that Deo and Chen showed that 
the asymptotic properties of the test as presented by Beran (1992) were 
incorrect. We subjected the Beran test to a simulation study, the results of which 
will be presented at a later date. This study showed that for sample sizes of 
the order studied here the Beran test over-rejects the null hypothesis by a small 
amount (e.g. typically 6 percent at the 5 percent level). Thus for pragmatic 
testing of goodness of fit, the Beran test can still be used with appropriate 
caution; alternatively, critical values can be obtained through simulation. 

Method                 H Est   Beran     CPU       Expected   Empirical 
                               p-value   Seconds   H          p-value 
Absolute Value         0.862   0.435      0.19     0.831      0.70 
Aggregated Variance    0.889   0.577      0.34     0.821      0.04* 
Boxed Periodogram      0.914   0.509      0.09     0.849      0.01** 
Differenced Variance   1.089   -          0.21     0.925      0.01** 
GPH                    1.037   -          1.43     0.897      0.16 
Haslett-Raftery        0.947   0.241      0.17     -          - 
Higuchi                0.966   0.102     19.65     0.845      < 0.001*** 
Peng                   0.936   0.344     18.46     0.875      < 0.001*** 
Periodogram            1.007   -          0.06     0.908      < 0.001*** 
Rescaled Range         0.892   0.577      0.04     0.816      0.36 
Wavelet                0.927   0.421      0.07     0.889      0.25 
Whittle                0.876   0.540      1.05     0.890      0.15 

Table 1 
The first two columns of results present the H estimates and p-values returned by the 
Beran (1992) test for the Campito Mountain data for each of the 10 estimators of H and 
the GPH and Haslett-Raftery estimators of d converted to H equivalent. CPU times are in 
seconds on the SunBlade described in the text. The expected H column is the expected value 
that the estimator would report if the Campito data was an FGN with H = 0.89. The 
empirical p-value column is estimated empirically from the simulated data. 

The results are presented in Table 1. The Beran test indicated an H value 
close to 0.89 fitted this data best. The maximum p-value was 0.577 for values of 
H estimated by the aggregated variance and rescaled range estimators. Nine of 
the 12 estimators reported H or d values which lie in an acceptable range on the 
basis of the Beran (1992) test assuming we set our level of statistical significance 
at 0.05 to reject the null hypothesis of an FGN. The remaining three could not 
be tested. 

Given the results from our simulated FGNs there were some unexpected H 
estimates for the Campito data. On the basis of the simulations we expected the 
aggregated variance, absolute value, boxed periodogram, Higuchi, and rescaled 
range to return a low estimate for H. None of these estimators did so. As the 
Beran test reported that H = 0.89 yielded the best fit we used the median value 
from the simulations with series length 5400 and H = 0.90, and adjusted for the 
difference of 0.01 H units, to estimate the value of H which would be reported by 
each estimator if the data was from an FGN. This value is reported in Table 1 
as "Expected H". The sixth column reports the empirically determined p-value 
for the actual estimate again using the simulated data. We do not report values 
for the Haslett-Raftery estimator as it estimates d not H. It is interesting that 
six of the estimators reported H estimates which are statistically significantly 
higher than their expected values. 
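
The calculation of the expected H and the empirical p-value can be sketched as follows. The shift of 0.01 H units applied to the H = 0.90 simulations is taken from the text. The two-sided tail proportion used for the p-value is an assumption made for this illustration, because the text does not spell out the exact calculation. The function names and the placeholder simulated values are hypothetical.

```python
def expected_estimate(simulated, shift=-0.01):
    """Median of the simulated estimates after shifting from H = 0.90 to H = 0.89."""
    values = sorted(s + shift for s in simulated)
    n = len(values)
    mid = n // 2
    if n % 2 == 1:
        return values[mid]
    return 0.5 * (values[mid - 1] + values[mid])


def empirical_p_value(simulated, observed, shift=-0.01):
    """Two-sided tail proportion of the shifted simulated estimates (assumed form)."""
    shifted = [s + shift for s in simulated]
    n = len(shifted)
    upper = sum(1 for s in shifted if s >= observed) / n
    lower = sum(1 for s in shifted if s <= observed) / n
    return min(1.0, 2.0 * min(upper, lower))


# Placeholder values standing in for one estimator's simulated estimates
# at series length 5400 and H = 0.90.
placeholder_sims = [0.80 + 0.001 * i for i in range(101)]
print(expected_estimate(placeholder_sims))
print(empirical_p_value(placeholder_sims, observed=0.95))
```

A small p-value under this sketch means the observed estimate lies in the tails of what the estimator produced on simulated FGNs, which is the sense in which the starred entries of Table 1 are unexpected.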

The estimator which had the least bias and narrowest confidence interval in 
the simulations with series length 5400 and H = 0.90, namely the Whittle, was 
marginally outperformed by the aggregated variance and rescaled range when 
judged on the basis of the Beran test. 
The fourth column of Table 1 reports CPU times in seconds. With present 
day computer speeds estimation times on the Campito series are not an issue. 
Only four estimators required more than one second of CPU time on this 5405 
observation series. It is evident that some estimators which require longer 
compute times, such as the Higuchi and Peng, did not necessarily yield a more 
accurate estimation of H for this data. 



5. Discussion 



It is clear from the simulations that not all estimators are created equal. Long 
memory occurs in the range 0.5 < H < 1. Thus any estimator used to estimate 
the strength of the long memory needs to be both accurate and have a low 
variance. 

The boxed periodogram method was developed specifically to deal with the 
problem of having most of the points used to estimate H on the right-hand side 
of the graph. This was believed to be a possible cause of bias in the 
periodogram estimator. Beran (1994, p. 133) and Taqqu et al. (1995) outline some 
of the reasons such a method could be expected to be biased. In the series we 
investigated here the boxed periodogram estimator is inferior to the 
periodogram estimator it was intended to improve upon. 

The differenced variance method was developed to be robust to trends which 
were known to cause spurious long memory in the R/S estimator (Bhattacharya et al., 
1983). We did not test its robustness. Teverovsky and Taqqu (1999) established 
that the differenced variance method had a higher variance than the aggregated 
variance method, a result supported by our simulation study. In fairness to the 
method it must be pointed out that Teverovsky and Taqqu (1999) did not intend 
for it to be used alone but rather in conjunction with the aggregated variance 
method to test for the presence of shifting means or deterministic trends. 

The Higuchi (1988) estimator only indirectly estimates H. It estimates the 
fractal dimension, D, of a series by estimating its path length. As implemented 
it then converts the estimate of D to H by the simple relationship H = 2 - D. 
This should be borne in mind if a researcher wishes to estimate D rather than 
H as it is a simple matter to recover D from the H estimate reported by this 
implementation. 
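
Recovering D from a reported H estimate is a one-line inversion of H = 2 - D. The sketch below applies it to the Higuchi H estimate for the Campito data given in Table 1 (0.966). The function name is hypothetical.

```python
def fractal_dimension_from_h(h_estimate: float) -> float:
    """Invert H = 2 - D to recover the fractal dimension D from an H estimate."""
    return 2.0 - h_estimate


# Higuchi H estimate for the Campito Mountain data (Table 1).
campito_higuchi_h = 0.966
campito_fractal_dimension = fractal_dimension_from_h(campito_higuchi_h)
print(f"D = {campito_fractal_dimension:.3f}")
```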

Taqqu et al. (1995) give a detailed proof that the method of Peng et al. (1994) 
is asymptotically unbiased. In the simulations the bias was never large but 
even at a sample size of 10,000 observations the estimator cannot be considered 
unbiased. However, its MSE approaches that of the periodogram method as the 
series length increases which, in turn, is better than several others. 

The wavelet estimator is asymptotically unbiased. In the simulations the bias 
was always small and the estimator was unbiased for series with 4,100 or more 
observations. 
6. Conclusions and Future Research 

Of the twelve estimators examined here the Whittle and Haslett-Raftery esti- 
mators performed the best on simulated series. If we require an estimator to 
be close to unbiased across the full range of H values for which long memory 
occurs and have a 95 percent confidence interval width of less than 0.1 H or 
d units (that is 20 percent of the range for H or d values in which long mem- 
ory is observed), then for series with less than 4,000 data points they were the 
only two estimators worth considering. It should be noted that these estimators 
did not meet these criteria until the series lengths exceeded 700 and 1000 data 
points respectively. For series with 4,000 or more data points, the Peng estima- 
tor gave acceptable performance. For series with more than 7,000 data points 
the periodogram estimator was a worthwhile choice. For series with more than 
8,200 data points the wavelet became a viable estimator. The remaining seven 
estimators did not give acceptable performance at any series lengths examined 
and are not recommended. 

The Higuchi estimator is useful if the researcher wishes to recover the fractal 
dimension of the time series. In contrast to the other estimators it provides 
useful information about a time series if the series is not an FGN (or FI(d)) 
series. As an estimator of H it is inferior to several others. 

The boxed periodogram method is clearly inferior to the periodogram method 
it was intended to improve upon for FGNs. Further research would be needed 
to test if it is more robust than the periodogram method in series with depar- 
tures from a pure FGN. This could be accomplished, for example, by simulating 
ARFIMA series with non-zero AR and MA components or series with structural 
breaks. 

The R/S estimator is of considerable historical interest but had a major 
deficiency in that its MSE plateaued while all other estimators' MSEs decreased 
with increasing series length. Against this we must note that it was one of the 
two best performing estimators when applied to the Campito data when judged 
by the Beran test. 

The differenced variance estimator was the worst of the twelve estimators in 
short series. For series longer than 6,000 data points its MSE was better than 
the R/S and on a par with the absolute value, aggregated variance and Higuchi 
methods. As noted above, Teverovsky and Taqqu (1999) do not recommend its 
use in isolation as it is part of a test for shifting means or deterministic trends. 
Teverovsky and Taqqu (1999) also recommend the aggregated and differenced 
variance plots always be examined visually. We agree with these 
recommendations. We did not test its robustness to shifting means or 
deterministic trends. Some numerical results on its performance in these two 
situations can be found in Teverovsky and Taqqu (1999). 



In the application to the Campito data, six of the estimators reported H 
estimates that differed statistically significantly from those expected on the 
basis of the simulated series. Although the fit of the Campito data to an FGN is 
good (p=0.577), these six estimators do not seem to be robust to whatever 
specific departures from an FGN are present in the data. This suggests that 
a researcher should not rely on a single estimator when estimating H and that 
the Beran (1992) test should always be applied to test the goodness of fit of the 
data to an FGN, while being aware that its asymptotic properties are currently 
unknown. 

Because of the apparent lack of robustness, a useful avenue of future research 
would be to quantify the sensitivity of these estimators to various types of 
departures from an FGN, e.g. FGN series with a small number of shifts in mean 
or a small number of outlier data points. 